-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix registry unsupported pipeline update #96497
Fix registry unsupported pipeline update #96497
Conversation
// Test that stats are serializable and can be gathered | ||
getTrainedModelStats(); | ||
} | ||
// Test that stats are serializable and can be gathered |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidkyle Reverted your workaround.
for (int i = 0; i < 10; i++) { | ||
assertInfer(modelId); | ||
} | ||
waitForDeploymentStarted(modelId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davidkyle Reverted your workaround.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @afoucret. I added the test-full-bwc
label for extra CI coverage. The label means the upgrade tests will be run against all backwards compatible versions
@@ -253,4 +258,13 @@ protected boolean requiresMasterNode() { | |||
// there and the ActionNotFoundTransportException errors are then prevented. | |||
return true; | |||
} | |||
|
|||
@Override | |||
protected boolean isClusterReady(ClusterChangedEvent event) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@eyalkoren Now checking all nodes are at least at v 8.9.0, so it does not fails because of the ignore_missing_pipeline parameter
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are you preventing with this?
AFAIK, once the local implementation of requiresMasterNode()
returns true
, as in this case, this ensures that the upgrade will occur only on the elected master. I believe that during rolling upgrades this ensures that this happens only after all non-master nodes are already upgraded.
Since the usage of ignore_missing_pipeline
was introduced in #95971, which was added to 8.9.0 and not back-ported, I am not sure whether this is required.
BTW, AnalyticsTemplateRegistry
also requires master node, so double-check if this is required in the original case as well.
@jbaiera @dakrone please confirm or enlighten me if I got it wrong
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no guarantee that the non-master nodes will be upgraded first during a rolling upgrade. This is a sensible precaution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would nodes apply settings coming from a master of a higher version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dakrone just explained that I indeed got it wrong and there is no actual enforced guarantee as to the order of upgrade in code. This should normally not occur if the upgrade is done according to documentation (master node last), but such verification does make sense
Pinging @elastic/ent-search-eng (Team:Enterprise Search) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@@ -253,4 +258,13 @@ protected boolean requiresMasterNode() { | |||
// there and the ActionNotFoundTransportException errors are then prevented. | |||
return true; | |||
} | |||
|
|||
@Override | |||
protected boolean isClusterReady(ClusterChangedEvent event) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no guarantee that the non-master nodes will be upgraded first during a rolling upgrade. This is a sensible precaution.
for (int i = 0; i < 10; i++) { | ||
assertInfer(modelId); | ||
} | ||
waitForDeploymentStarted(modelId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @afoucret. I added the test-full-bwc
label for extra CI coverage. The label means the upgrade tests will be run against all backwards compatible versions
Thank you @davidkyle and @eyalkoren for your careful reviews! |
I had this on my list to look at today (just found out about it this morning). Next time since the Data Management team owns the |
Closes: #95766
The update tests were failing because pipeline are using new processor config that have been introduced recently.
ignore_missing
param of theuri_parts
processor (introduced in8.8.0
)ignore_missing_pipeline
param of thepipeline
processor (introduced in8.9.0
)Both registry are now installed only when all the modes in the cluster are updated to the right version.